ABSTRACT



Apparatus and methods are provided for encoding video 
data in a manner which significantly reduces the computa- 
tion performed by the video encoder and the video decoder 
without suffering any degradation in the perceived quality of 
the compressed video data. In particular, apparatus and 
methods are provided for determining which blocks might 
be zeroed out after quantization. This determination is 
performed after motion estimation, the classification of the 
frame as either an I frame, P frame, or a B frame, and the 
determination of a quantization step size (QP) for the block, 
but before DCT. If a given block is determined to be a "zero" 
quantized block, then the DCT, quantization, zig-zag scan 
and variable length coding steps are omitted, and a variable 
length code output is provided indicating that the block B is 
a "zero" quantized block. The present invention determines 
which blocks might be zeroed out after quantization by 
using one or more key features of the motion compensated 
blocks which will help in classifying these blocks into zero 
and nonzero blocks. Examples of these features include the 
mean absolute value of a block, the mean square error of a 
block, the block variance, the mean absolute difference of a 
block, and the maximum value in a block. Each feature is 
provided with a predetermined threshold that has been 
experimentally calculated. 

26 Claims, 10 Drawing Sheets

METHOD AND APPARATUS FOR MOTION COMPENSATED VIDEO CODING

This application is a continuation of the U.S. patent
application Ser. No. 09/006,972, filed on Jan. 14, 1998,
which is hereby incorporated by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention
The present invention relates generally to video coding,
and in particular, to pre-quantization of motion compensated
blocks for video coding at very low bit rates. The present
invention provides a method and an apparatus for
significantly reducing the number of computations at a video
encoder.

2. Background Art
FIG. 1 illustrates the general structural blocks that are
used for, and the steps involved in, the conventional digital
coding of a sequence of video images. In particular, the
video image is made up of a sequence of video frames 10
that are captured, such as by a digital camera, and
transmitted to a video encoder 12. The video encoder 12
receives the digital data on a frame-by-frame and
macroblock-by-macroblock basis, and applies a video encoding
algorithm to compress the video data. In some applications,
the video encoding algorithm can also be implemented in
hardware. The video encoder 12 generates an output which
consists of a binary bit stream 14 that is processed by a
modulator 16. The modulator 16 modulates the binary bit
stream 14 and provides the appropriate error protection. The
modulated binary bit stream 14 is then transmitted over an
appropriate transmission channel 18, such as through a
wireless connection (e.g., radio frequency), a wired
connection, or via the Internet. The transmission can be done
in an analog format (e.g., over phone lines or via satellite)
or in a digital format (e.g., via ISDN or cable). The
transmitted binary bit stream 14 is then demodulated by a
demodulator 20 and provided to a video decoder 22. The video
decoder 22 takes the demodulated binary bit stream 24 and
converts or decodes it into sequential video frames. These
video frames are then provided to a display 26, such as a
television screen or monitor, where they can be viewed. If
the transmission channel 18 utilizes an analog format, a
digital-to-analog converter is provided at the modulator 16
to convert the digital video data to analog form for
transmission, and an analog-to-digital converter is provided
at the demodulator 20 to convert the analog signals back into
digital form for decoding and display.

The video encoding can be embodied in a variety of ways.
For example, the actual scene or image can be captured by
a camera and provided to a chipset for video encoding. This
chipset could take the form of an add-on card that is added
to a personal computer (PC). As another example, the
camera can include an on-board chip that performs the video
encoding. This on-board chip could take the form of an
add-on card that is added to a PC, or as a separate
stand-alone video phone. As yet another example, the camera
could be provided on a PC and the images provided directly
to the processor on the PC which performs the video
encoding.

Similarly, the video decoder 22 can be embodied in the
form of a chip that is incorporated either into a PC or into
a video box that is connected to a display unit, such as a
monitor or television set.

Each digital video frame 10 is made up of x columns and
y rows of pixels (also known as "pels"). In a typical frame
10 (see FIG. 2), there could be 720 columns and 640 rows
of pels. Since each pel contains 8 bits of data (for luminance
data), each frame 10 could have over three million bits of
data (for luminance data). If we include chrominance data,
each pel has up to 24 bits of data so that this number is even
greater. This large quantity of data is unsuitable for data
storage or transmission because most applications have
limited storage (i.e., memory) or limited channel bandwidth.
To respond to the large quantity of data that has to be stored
or transmitted, techniques have been provided for compressing
the data from one frame 10 or a sequence of frames 10
to provide an output that contains a minimal amount of data.
This process of compressing large amounts of data from
successive video frames is called video compression, and is
performed in the video encoder 12.

During conventional video encoding, the video encoder
12 will take each frame 10 and divide it into blocks. In
particular, each frame 10 can be first divided into
macroblocks MB, as shown in FIG. 2. Each of these macroblocks
MB can have, for example, 16 rows and 16 columns of pels.
Each macroblock MB can be further divided into four blocks
B, each block having 8 rows and 8 columns of pels. Once
each frame 10 has been divided into blocks B, the video
encoder 12 is ready to compress the data in the frame 10.

FIG. 3 illustrates the different steps, and the possible
hardware components, that are used by the conventional
video encoder 12 to carry out the video compression. Each
frame 10 is provided to a motion estimation engine 30 which
performs motion estimation. Since each frame 10 contains a
plurality of blocks B, the following steps will process each
frame 10 on a block-by-block basis.

Motion estimation calculates the displacement of one
frame in a sequence with respect to the previous frame. By
calculating the displacement on a block basis, a displaced
frame difference can be computed which is easier to code,
thereby reducing temporal redundancies. For example, since
the background of a picture or image usually does not
change, the entire frame does not need to be encoded, and
only the moving objects within that frame (i.e., representing
the differences between sequential frames) need to be
encoded. Motion estimation will predict how much the
moving object will move in the next frame based on certain
motion vectors, and will then take the object and move it
from a previously reconstructed frame to form a predicted
frame. At the video decoder 22, the previously reconstructed
frame, together with the motion vectors used for that frame,
will reproduce the predicted frame at the video decoder 22
(known as "motion compensation"). The predicted
frame is then subtracted from the previously reconstructed
frame to obtain an "error" frame. This "error" frame will
contain zeros at the pels where the background did not move
from the previously reconstructed frame to the predicted
frame. Since the background makes up a large part of the
picture or image, the "error" frame will typically contain
many zeros.

Each frame 10 can be either an "intraframe" (also known
as "I" frame) or an "interframe" (also known as "P" frame).
Each I frame is coded independently, while each P frame
depends on previous frames. In other words, a P frame uses
temporal data from previous P frames to remove temporal
redundancies. An example of a temporal redundancy can be
the background of an image that does not move from one
frame to another, as described above. For example, the
"error" frame described above would be a P frame. In
addition to I and P frames, there also exists another type of
frame, known as a "B" frame, which uses both previous and
future frames for prediction purposes.
Now, referring back to FIG. 3, all digital frames 10
received from the motion estimation engine 30 are provided
to a frame-type decision engine 40, which operates to divide
all the incoming frames 10 into I frames, P frames and B
frames. Whether a frame 10 becomes an I, P or B frame is
determined by the amount of motion experienced by that
frame 10, the degradation of distortion, type of channel
decisions, and desired user parameters, among other factors.
From this point onward, all I, P and B frames are processed
in the same manner.

Each block B from each frame 10 is now provided to a QP
decision engine 50 which determines a QP or quantization
step size number for the block or groups of blocks. This QP
number is determined by a rate control mechanism which
divides a fixed bit budget of a frame among different blocks,
and is used by the quantization engine 80 to carry out
quantization as described below.

Each block B is now provided to a DCT engine 60. DCT
of individual blocks helps in removing the spatial
redundancy by bringing down the most relevant information into
the lower most coefficients in the DCT domain. DCT can be
accomplished by carrying out a Fourier-like transformation
of the values in each block B. DCT produces a transformed
block 70 in which the lower frequency values are placed in
the top left corner 72 of the transformed block 70, and the
higher frequency values are placed in the bottom right
corner 74.

After having obtained a block 70 of DCT coefficients
which contain the energy of the displaced blocks,
quantization of these blocks 70 is performed by quantization
engine 80. Quantization is a uniform quantization with a step
size (i.e., the predetermined QP) varying within a certain
range, such as from 2 to 62. It is implemented as a division,
or as a table look-up operation for a fixed-point
implementation, of each value in the transformed block 70. For
example, the quantization level for each value in the block
70 can be determined by dividing the value by 2QP.
Therefore, if QP is 10 and a value in the block is 100, then
the quantization level for this value is equal to 100 divided
by 2QP, or 5. At the video decoder 22 in FIG. 1, the value is
reconstructed by multiplying the quantization level (i.e., 5)
by 2QP to obtain the original value of 100. Thus,
quantization takes a finite set of values and maps the set of
values, providing a quantized block 90 where the top left
corner 92 contains higher quantized levels, and the bottom
right corner 94 contains mostly zeros.

Next, the quantized block 90 is provided to a zig-zag scan
engine 100 which performs a zig-zag scan of the values in
the block 90. The direction of the scan is illustrated in FIG.
4, and begins from the top left corner 92, which contains the
higher quantized levels, through the middle of the block 90
and to the bottom right corner 94, which contains mostly
zeros. The zig-zag scan produces a zig-zag scan block 110
in which the quantized values from the quantized block 90
are positioned linearly across the zig-zag scan block 110.
Therefore, zig-zag scan emulates going from a lower to a
higher frequency, thereby resulting in long runs of zeros in
the zig-zag scan block 110.

The values in the zig-zag scan block 110 are then provided
to a variable length coding engine 120 where entropy coding
is performed. Traditionally, most video coding standards use
Huffman coding for entropy coding. First, a non-zero value
followed by runs of zeros is encoded as a single "event". For
example, "400000000000" and "10000000000000" would
each be encoded as separate single events. Entropy coding
is then performed on these events to generate a unique
binary code for each event. These binary codes are output as
the binary bitstream 14 described above. These unique
binary codes can be recognized by the video decoder 22 and
decoded by the video decoder 22 into the original values
(i.e., non-zero values followed by runs of zeros).

Thus, the conventional video encoder 12 and its
operation, as illustrated in FIG. 3, function to minimize (i.e.,
compress) the large number of bits at the input blocks B of
each frame 10 (see FIG. 2) to a minimal number of bits at
the bitstream 14, taking advantage of the fact that the DCT
and quantization steps will produce multiple runs of zeros.
The transmitted bitstream 14 is decoded by the video
decoder 22 by reversing the steps performed by the video
encoder 12.

The values in each frame 10 can represent different
meanings. For example, in an I frame, each value can range
from zero to 255, with zero representing the darkest (or
black) pel, and 255 representing the brightest pel. In a P
frame, each value can range from -128 to +127, with -128
and +127 representing the maximum residual value possible
or a lot of edge information, and zero representing no
residual.

While the above-described conventional video encoder 12
and method is effective in compressing the amount of data
to be transmitted, it requires much computation and
therefore increases the time and cost of the video encoder 12
and video decoder 22. In particular, motion estimation is the
most computationally intensive part of the video encoding
process, and often accounts for more than half of the
processing. For this reason, many video encoding solutions
prefer to perform motion estimation either by using
dedicated hardware, or by some fast sub-optimal software
scheme. Dedicated hardware can be realized as an ASIC
(Application Specific Integrated Circuit) or as an FPGA
(Field Programmable Gate Array). While dedicated
hardware provides fast and accurate motion estimation, it can
be very expensive. As a result, software schemes are often
preferred because they are less expensive. These software
schemes achieve fast motion estimation by doing a
suboptimal search using the inherent processor of a PC or
workstation. Unfortunately, the motion estimation performed
by these software schemes is generally less accurate.

Although motion estimation is the most computationally
intensive part of the video encoding process, DCT,
quantization, zig-zag scanning and variable length coding
are also computationally intensive. Unfortunately, in the
conventional video encoding and decoding method, all
frames 10 must go through DCT, quantization, zig-zag scan
and variable length coding.

Thus, there still remains a need for a video encoder and
method which significantly reduces the computation
performed by the video encoder and the video decoder without
suffering any degradation in the perceived quality of the
compressed video data.

SUMMARY OF THE INVENTION

The present invention provides apparatus and methods for
encoding video data in a manner which significantly reduces
the computation performed by the video encoder and the
video decoder without suffering any degradation in the
perceived quality of the compressed video data. In
particular, the present invention provides apparatus and
methods for determining which blocks might be zeroed out
after quantization. This determination is performed after
motion estimation, the classification of the frame as either an
I frame, a P frame, or a B frame, and the determination of
a quantization step size (QP) for the block, but before DCT.
If a given block is determined to be a "zero" quantized
block, then the video encoder and method of the present
invention skips or omits the DCT, quantization, zig-zag scan
and variable length coding steps, and merely transmits a
variable length code output indicating that the block B is a
"zero" quantized block.

The present invention determines which blocks might be
zeroed out after quantization by using one or more key
features of the motion compensated blocks which will help
in classifying these blocks into zero and non-zero blocks.
Examples of these features include the mean absolute value
of a block, the mean square error of a block, the block
variance, the mean absolute difference of a block, and the
maximum value in a block. Each feature is provided with a
predetermined threshold that has been experimentally
calculated. Each of these features is easy to compute and is
based on first or second ordered moments of the motion
compensated blocks.

In the present invention, each feature is used to compute
a value for a given block, with the feature value of that block
compared to its predetermined threshold. The block is
classified as a zero quantized block if the feature value of
that block is less than or equal to the threshold.

In accordance with one embodiment of the present
invention, one or more features may be used jointly to
determine whether the block should be classified as a zero
quantized block. In accordance with another embodiment of
the present invention, a selector is provided to select one of
a plurality of features to be used in the determination.

Thus, by skipping the DCT, quantization, zig-zag scan
and variable length coding steps for zero-quantized blocks,
the present invention significantly reduces the computations
required by the video encoder. There is also little or no
degradation in the perceived quality of the compressed video
data because the thresholds for the features can be
experimentally determined and selected so as to minimize the
potential of misclassifying blocks.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates the general structural blocks that are
used for, and the steps involved in, the conventional digital
coding of a sequence of video images.

FIG. 2 is a simplified illustration of one frame of video
data, and the macroblocks and blocks that make up this
frame.

FIG. 3 illustrates the different steps, and the hardware
components, that are used by the video encoder of FIG. 1 to
carry out conventional video compression.

FIG. 4 is a simplified illustration of how a quantized block
of data is scanned in a zig-zag manner.

FIG. 5 illustrates the different steps, and the hardware
components, that are used by the video encoder according to
one embodiment of the present invention.

FIG. 6 illustrates a first embodiment of the zero block
locater engine of FIG. 5.

FIG. 7 illustrates a second embodiment of the zero block
locater engine of FIG. 5.

FIG. 8 illustrates a third embodiment of the zero block
locater engine of FIG. 5.

FIG. 9 is a scatter plot showing SAE and MAX for all the
blocks for coding of the sequence "carphone" at 20 Kbps.

FIG. 10 illustrates the distribution of zero blocks and one
blocks in SAE-MAX feature space.

DETAILED DESCRIPTION OF THE INVENTION

In the following description, for purposes of explanation
and not limitation, specific details are set forth in order to
provide a thorough understanding of the present invention.
However, it will be apparent to one skilled in the art that the
present invention may be practiced in other embodiments
that depart from these specific details. In certain instances,
detailed descriptions of well-known or conventional data
processing techniques, hardware devices and circuits are
omitted so as to not obscure the description of the present
invention with unnecessary detail.

The video encoding method and apparatus according to
the present invention is based on the observation that a
majority of quantized blocks 90 are zero (i.e., all the values
in the block 90 are zero). Even though these "zero"
quantized blocks 90 would only produce a simple variable
length code output 14, these "zero" quantized blocks 90 would
still have to go through DCT, quantization and variable length
coding at the video encoder 12. This means that if it can be
detected a priori which blocks 90 would be zeroed out after
quantization, the hardware and steps for DCT, quantization,
and variable length coding can be omitted, thereby
significantly reducing the computations required for a large
number of blocks B. This in turn would translate to a
significant reduction in computations at the video encoder 12
and the video decoder 22.

As a result, the present invention provides apparatus and
methods for determining which blocks might be zeroed out
after quantization with minimal or no loss in the perceived
quality of the compressed video data. This determination is
performed prior to the DCT step. If a given block B exiting
from the QP engine is determined to be a "zero" quantized
block, then the video encoder of the present invention skips
or omits the DCT, quantization, zig-zag scan and variable
length coding steps, and merely transmits a variable length
code output indicating that the block B is a "zero" quantized
block.

FIG. 5 provides a general schematic illustration of the
present invention as embodied in a video encoder 12a
according to the present invention. The video encoder 12a is
similar to the conventional video encoder 12 illustrated in
FIG. 3, with the engines and transformed blocks of FIG. 5
being assigned the same numerals as the corresponding
engines and transformed blocks of FIG. 3, except that an "a"
has been added to the engines and transformed blocks of
FIG. 5. Therefore, the operation of the common components
in FIGS. 3 and 5 will not be repeated herein.

As illustrated in FIG. 5, the video encoder 12a according
to the present invention provides a zero block locater engine
200 between the QP engine 50a and the DCT engine 60a. If
a given block B exiting from the QP engine 50a is
determined to be a "zero" quantized block 90a, then the zero
block locater engine 200 merely transmits a variable length
code output indicating that the block B is a "zero" quantized
block 90a. Otherwise, the block B is transmitted to the DCT
engine 60a and passes through the DCT, quantization, zig-zag
scan and variable length coding steps to produce a binary bit
stream 14 to be output to the modulator 16.

The zero block locater engine 200 utilizes one or more
key features of the motion compensated blocks which will
help in classifying these blocks into zero and non-zero
blocks. A preferred feature is one that would allow accurate
classification of the blocks while involving minimal
computations. A few non-limiting features are set forth
hereinbelow. Each of these features is easy to compute and is
based on first or second ordered moments of the motion
compensated blocks.

a. "Mean Absolute Error" or SAE

A first feature is the "mean absolute error", or SAE, which
is calculated as follows:

$$\mathrm{SAE} = \sum_{i=1}^{8} \sum_{j=1}^{8} \left| f(i,j) \right|$$

where f(i, j) is a pel value of the block. SAE involves
summing the magnitude of the values of all the pels in a
block. Therefore, assuming that the values in a given block
B exiting the QP engine 50a are [0, -3, -4, 0, 0, ..., 7], the
SAE value will be [0+3+4+0+...+7]=14. SAE is a good
indicator of the activity of a block. A high SAE value will
usually indicate that the block contains significant residual,
indicating that the motion estimation was not efficient for
this block.

b. "Mean Square Error" or MSE

A second feature is the "mean square error", or MSE,
which is calculated as follows:

$$\mathrm{MSE} = \sum_{i=1}^{8} \sum_{j=1}^{8} f(i,j)^{2}$$

where f(i, j) is a pel value of the block. MSE involves
summing the square of the values of all the pels in a block.
Therefore, assuming that the values in a given block B
exiting the QP engine 50a are [0, -3, -4, 0, 0, ..., 7], the
MSE value will be [0+9+16+0+...+49]=74. MSE is a good
indicator of how much change a block has experienced since
its original image.

c. "Block Variance" or VAR

A third feature is the "block variance", or VAR, which is
calculated as follows:

$$\mathrm{VAR} = \sum_{i=1}^{8} \sum_{j=1}^{8} \left( f(i,j) - \mu \right)^{2}$$

where f(i, j) is a pel value of the block, and

$$\mu = \frac{1}{64} \sum_{i=1}^{8} \sum_{j=1}^{8} f(i,j)$$

where μ is essentially the mean of all the values in the block.
VAR indicates how much the block varies within itself.

d. "Mean Absolute Difference" or MAD

A fourth feature is the "mean absolute difference", or
MAD, which is calculated as follows:

$$\mathrm{MAD} = \sum_{i=1}^{8} \sum_{j=1}^{8} \left| f(i,j) - \mu \right|$$

where f(i, j) is a pel value of the block, and

$$\mu = \frac{1}{64} \sum_{i=1}^{8} \sum_{j=1}^{8} f(i,j)$$

MAD indicates the local variance within a block, and is
similar to VAR.

e. "Maximum Value" or MAX

A fifth feature is the "maximum value", or MAX, of the
pel values in the block:

$$\mathrm{MAX} = \max \left\{ \left| f(i,j) \right| : i = 1, \ldots, 8;\ j = 1, \ldots, 8 \right\}$$

where f(i, j) is a pel value of the block. MAX is a good
indicator of the presence of any edges which should be
detected because edge values tend to be higher.
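
The five features described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the implementation of the invention: the function name `block_features`, the representation of a block as a list of rows, the use of unnormalized sums for VAR and MAD, the division of the mean by the number of pels, and the use of magnitudes for MAX are all assumptions made for the example.

```python
def block_features(block):
    """Compute SAE, MSE, VAR, MAD and MAX for a block given as rows of pel values.

    Assumptions (not stated in the text): sums are not normalized, the mean is
    divided by the number of pels, and MAX is taken over magnitudes.
    """
    pels = [v for row in block for v in row]
    mean = sum(pels) / len(pels)
    return {
        "SAE": sum(abs(v) for v in pels),
        "MSE": sum(v * v for v in pels),
        "VAR": sum((v - mean) ** 2 for v in pels),
        "MAD": sum(abs(v - mean) for v in pels),
        "MAX": max(abs(v) for v in pels),
    }


# An 8x8 residual block that is zero except for the pels -3, -4 and 7,
# mirroring the worked SAE and MSE examples in the text.
block = [[0] * 8 for _ in range(8)]
block[0][1], block[0][2], block[7][7] = -3, -4, 7
features = block_features(block)
```

For this block the sketch gives an SAE of 14 and an MSE of 74, which agrees with the worked examples given for those two features.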

Any or all of the above features can be used in classifying 
the motion compensated blocks as "zero" or "one". A "zero" 
block is defined as a block with all zero quantized DCT 
coefficients and a "one" block is a block with at least one 
non-zero quantized DCT coefficient. 
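
The definition of a "zero" block can be checked directly by running a block through the DCT and quantization steps, as in the following Python sketch. It is a hedged illustration only: the orthonormal DCT-II scaling, the truncation toward zero after dividing by 2QP, and the names `dct_2d`, `quantize` and `is_true_zero_block` are assumptions, since the text does not fix these details.

```python
import math

N = 8  # block size used in the text (8 rows by 8 columns)


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for i in range(N):
                for j in range(N):
                    s += (block[i][j]
                          * math.cos((2 * i + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def quantize(coeffs, qp):
    """Quantization level = value divided by 2*QP, truncated toward zero (assumed rounding)."""
    return [[int(x / (2 * qp)) for x in row] for row in coeffs]


def is_true_zero_block(block, qp):
    """True if every quantized DCT coefficient of the block is zero."""
    levels = quantize(dct_2d(block), qp)
    return all(level == 0 for row in levels for level in row)
```

With QP equal to 10, `quantize([[100.0]], 10)` yields a level of 5, matching the quantization example given earlier in the description. This brute-force check is the expensive path that the zero block locater engine is meant to avoid; it is useful mainly as ground truth when tuning thresholds.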

To perform the classification, one or more of the above
features are selected, and each uses an experimentally
determined threshold to decide whether or not a block should
be classified as a "zero" block. If the value calculated by the
selected feature (also referred to as "feature value") is
greater than the specific threshold for that feature, then the
block is classified as a "one" block; otherwise it is classified
as a "zero" block. Thus, these thresholds can be selected to
minimize the potential of erroneously classifying blocks
(e.g., misclassifying a "zero" block as a "one" block, or vice
versa).
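
The threshold rule described above can be sketched as follows. The function name `classify_block`, the dictionary layout, and the numeric threshold values in the usage lines are hypothetical and chosen only for illustration; actual thresholds are determined experimentally.

```python
def classify_block(feature_values, thresholds):
    """Return "one" if any selected feature exceeds its threshold, else "zero".

    feature_values: dict mapping a feature name to its value for the block.
    thresholds: dict mapping each selected feature name to its threshold.
    Only the features listed in `thresholds` take part in the decision.
    """
    for name, threshold in thresholds.items():
        if feature_values[name] > threshold:
            return "one"
    return "zero"


# Hypothetical feature values and thresholds, for illustration only.
quiet_block = {"SAE": 14, "MAX": 7}
busy_block = {"SAE": 950, "MAX": 90}
label_quiet = classify_block(quiet_block, {"SAE": 200})
label_busy = classify_block(busy_block, {"SAE": 200, "MAX": 40})
```

A feature value exactly equal to its threshold still yields "zero", since only values greater than the threshold produce a "one" classification.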

The thresholds for all the above-described features are 
experimentally determined. To determine the optimal 
threshold for each feature, the methods described in FIGS. 
6-8 below are implemented for a given sequence of blocks 
by using different threshold values. The accuracy provided 
by each threshold is then compared with the actual result 
(i.e., which of the sequence of blocks are actual "zero" 
blocks?) obtained by putting that sequence of blocks through 
the entire encoding process of FIG. 3. The "misdetection 
rate" and the "false alarm rate" can then be measured for 
each threshold of each feature. A small "misdetection rate" 
means that most actual "zero" blocks will be classified as 
"zero" blocks. A negligible "false alarm rate" means that a 
very small number of true "non-zero" blocks will be clas- 
sified as "zero" blocks. The experimentation can be iterated 
by changing the thresholds until the desired "misdetection 
rate" and "false alarm rate" have been obtained for a given 
feature. Ideally, low values for both the "misdetection rate"
and "false alarm rate" mean that the perceived quality of the
encoded video sequence will be good.
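
The threshold experiment described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the "misdetection rate" is taken as the fraction of actual "zero" blocks classified as "one" blocks, the "false alarm rate" as the fraction of actual "one" blocks classified as "zero" blocks, and the names `error_rates` and `sweep_thresholds` are hypothetical. The ground truth (`actual_zero`) is assumed to come from running the blocks through the full encoding process.

```python
def error_rates(predicted_zero, actual_zero):
    """Return (misdetection_rate, false_alarm_rate) for parallel lists of booleans."""
    pairs = list(zip(predicted_zero, actual_zero))
    zeros = sum(1 for _, actual in pairs if actual)
    ones = len(pairs) - zeros
    missed = sum(1 for pred, actual in pairs if actual and not pred)
    false_alarms = sum(1 for pred, actual in pairs if pred and not actual)
    misdetection_rate = missed / zeros if zeros else 0.0
    false_alarm_rate = false_alarms / ones if ones else 0.0
    return misdetection_rate, false_alarm_rate


def sweep_thresholds(feature_values, actual_zero, candidates):
    """Evaluate each candidate threshold; a block is predicted "zero" if value <= threshold."""
    return {
        t: error_rates([v <= t for v in feature_values], actual_zero)
        for t in candidates
    }


# Made-up feature values and ground truth for four blocks, for illustration only.
results = sweep_thresholds([5, 50, 150, 400], [True, True, False, False], [10, 100, 200])
```

In this toy data set the lowest candidate misses one of the two actual "zero" blocks, the highest candidate wrongly zeroes one of the two actual "one" blocks, and the middle candidate classifies all four blocks correctly.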

The thresholds for each feature can be varied, recognizing
that there will be a trade-off between the amount of
computational savings and the overall perceived quality of the
encoded video sequence. For example, by selecting slightly
higher thresholds, the system 12a can further reduce the
number of computations (because certain "one" blocks may
be classified as "zero" blocks and therefore not processed),
but there will also be slight degradation of the perceived
quality of the encoded video sequence. In a preferred
embodiment of the present invention, the thresholds for each
feature are selected so that there is no degradation in the
perceived quality of the encoded video sequence. Even
though the present invention would select lower thresholds,
the present invention still provides large computational
savings, as described herein.

The thresholds can also be varied for different blocks
and/or frames during the encoding process, or the same
threshold can be used for the entire encoding process. For
example, a certain lower threshold can be used during a
certain portion of the encoding for a particular sequence of
video images where degradation in the perceived quality of
the encoded video sequence is less tolerated, while a higher
threshold can be used during other portions of the encoding
where some degradation in the perceived quality of the
encoded video sequence can be tolerated.

In addition, different types of frames (i.e., I, P or B) can
be provided with different experimentally-determined
thresholds. These frames may be provided with different
thresholds because different types of data are carried by the
different types of frames.

FIG. 6 illustrates one embodiment of the zero block
locater engine 200a according to the present invention. The
zero block locater engine 200a includes an SAE engine 210
which computes the SAE for the particular block B exiting
from the QP engine 50a. The SAE value of this block B is
then provided to a comparator 215 which determines
whether the SAE value is above a predetermined threshold.
If the SAE value is greater than the SAE threshold, then the
block B is classified as a "one" block and the values of the
block B are provided to the DCT engine 60a for further
processing as illustrated and described in connection with
FIGS. 3 and 5. If the SAE value is less than or equal to the
SAE threshold, then the block B is classified as a "zero"
block and the comparator 215 transmits a variable length
code output indicating that the block B is a "zero" quantized
block. In accordance with a preferred embodiment of the
present invention, the SAE threshold is selected to be K*QP,
where K is equal to 20. It is also possible to vary this
threshold within the encoding process (for different frames
and/or blocks) by providing different values for K.
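
The decision made by the embodiment of FIG. 6 can be sketched as follows. Only the threshold K*QP with K equal to 20 comes from the text; the names `sae`, `locate_zero_block` and `encode_block`, the `ZERO_BLOCK_CODE` placeholder, and the `full_pipeline` callable standing in for the DCT, quantization, zig-zag scan and variable length coding steps are hypothetical.

```python
K = 20  # constant given in the text for the preferred embodiment

ZERO_BLOCK_CODE = "ZERO_BLOCK"  # placeholder for the variable length code output


def sae(block):
    """Sum of the magnitudes of all pel values in the block."""
    return sum(abs(v) for row in block for v in row)


def locate_zero_block(block, qp, k=K):
    """True if SAE <= K*QP, i.e. the block is classified as a "zero" block."""
    return sae(block) <= k * qp


def encode_block(block, qp, full_pipeline):
    """Skip the expensive steps for "zero" blocks, otherwise run the full pipeline.

    full_pipeline is a hypothetical callable (block, qp) -> coded output that
    stands in for DCT, quantization, zig-zag scan and variable length coding.
    """
    if locate_zero_block(block, qp):
        return ZERO_BLOCK_CODE
    return full_pipeline(block, qp)


# Usage with a stand-in pipeline that only records that it was called.
calls = []
def fake_pipeline(block, qp):
    calls.append(qp)
    return "CODED"

quiet = [[0] * 8 for _ in range(8)]
busy = [[100] * 8 for _ in range(8)]
out_quiet = encode_block(quiet, 10, fake_pipeline)
out_busy = encode_block(busy, 10, fake_pipeline)
```

With QP equal to 10 the threshold is 200, so the all-zero block is signalled as a "zero" block without invoking the pipeline, while the block of 100s (SAE of 6400) is passed on for full processing.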

SAE is selected as the preferred feature in the embodiment
200a of FIG. 6 because SAE is computationally less
intensive than MSE, MAD or VAR. In addition, the
inventors have determined that SAE and MSE exhibit the best
performance; in other words, SAE and MSE minimize the
potential of misclassifying blocks.

FIG. 7 illustrates another embodiment of the zero block
locater engine 200b according to the present invention. The
zero block locater engine 200b includes a plurality of feature
engines, including for example, an SAE engine 220 which
computes the SAE for the particular block B exiting from the
QP engine 50a, an MSE engine 225 which computes the
MSE for the particular block B exiting from the QP engine
50a, and a MAX engine 230 which computes the maximum
value for the particular block B exiting from the QP engine
50a. Similar MAD and VAR engines can be provided as well
if desired. The value of each feature is then provided to a
separate comparator 235, 240 and 245, respectively, which
determines whether the value of the feature is above the
predetermined threshold for that feature. The output of the
comparator 235, 240, 245 is in turn provided to a
corresponding inverter or NOT gate 236, 241, 246, respectively.
The outputs of these NOT gates are coupled to an AND gate
250.

If the value of a feature is greater than that feature's
threshold, then the comparator 235, 240, 245 outputs a "1"
to the respective NOT gate 236, 241 or 246. If the value of
a feature is less than or equal to that feature's threshold, then
the comparator 235, 240, 245 outputs a "0" to the respective
NOT gate 236, 241 or 246. If all of the comparators 235,
240, 245 output a "0", then the block B is classified as a
"zero" block and a variable length code output (indicating
that the block B is a "zero" quantized block) is transmitted
to the modulator 16. However, if the output of at least one
comparator 235, 240 or 245 is "1", then the block B will
be classified as a "one" block and the values of the block B
are provided to the DCT engine 60a for further processing
as illustrated in connection with FIGS. 3 and 5.
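
As a non-limiting software sketch of the comparator, NOT gate and AND gate arrangement just described, the following assumes that SAE is the sum of absolute values, MSE is the mean of the squared values, and MAX is the largest absolute value in the block. These definitions, the function names, the thresholds and the sample blocks are all assumptions made for illustration.

```python
from typing import Callable, Dict, Sequence

Block = Sequence[Sequence[int]]


def sae(block: Block) -> float:
    """Sum of absolute values of the block (assumed definition of SAE)."""
    return float(sum(abs(v) for row in block for v in row))


def mse(block: Block) -> float:
    """Mean of the squared values of the block (assumed definition of MSE)."""
    values = [v for row in block for v in row]
    return sum(v * v for v in values) / len(values)


def max_abs(block: Block) -> float:
    """Largest absolute value in the block (assumed definition of MAX)."""
    return float(max(abs(v) for row in block for v in row))


FEATURES: Dict[str, Callable[[Block], float]] = {"SAE": sae, "MSE": mse, "MAX": max_abs}


def classify(block: Block, thresholds: Dict[str, float]) -> str:
    # Comparator stage: 1 when the feature value exceeds its threshold, else 0.
    comparator_bits = [int(FEATURES[name](block) > thresholds[name]) for name in FEATURES]
    # NOT gates invert each comparator output; the AND gate combines the results.
    and_gate = all(1 - bit for bit in comparator_bits)
    return "zero" if and_gate else "one"


# Hypothetical thresholds and 8x8 blocks, for illustration only.
thresholds = {"SAE": 200.0, "MSE": 10.0, "MAX": 9.0}
flat = [[1, -1, 0, 0, 1, 0, -1, 0] for _ in range(8)]
spike = [[0] * 8 for _ in range(8)]
spike[3][4] = 30  # one large value: SAE stays small, but MSE and MAX exceed their thresholds

print(classify(flat, thresholds))
print(classify(spike, thresholds))
```

In this sketch the second block passes the SAE test but is still classified as a "one" block, because a single comparator output of "1" is enough to force the AND gate low.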

As described above, the thresholds for each feature can be 
experimentally determined, and possibly varied. As non- 
limiting examples, the following thresholds can be used for 
each of the following features: 

SAE: K_SAE*QP

MSE: K_MSE*QP

MAD: K_MAD*QP

VAR: K_VAR*QP

MAX: K_MAX*QP

where K for each feature is a constant that is empirically
determined by extensive experimentation, as described 
above. As with the embodiment of FIG. 6, the thresholds for 
each feature can be varied by changing the values for K. 
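
The scaling of each threshold by QP might be sketched, purely as an illustration, as follows. Only the SAE constant of 20 is taken from the description of FIG. 6; every other constant below is a placeholder, since the text leaves those values to experimentation, and the names `K_CONSTANTS` and `thresholds_for` are hypothetical.

```python
from typing import Dict

# The SAE constant of 20 is the value given for the preferred embodiment.
# All other constants are placeholders only and are NOT values from the source.
K_CONSTANTS: Dict[str, float] = {
    "SAE": 20.0,
    "MSE": 1.0,  # placeholder
    "MAD": 1.0,  # placeholder
    "VAR": 1.0,  # placeholder
    "MAX": 1.0,  # placeholder
}


def thresholds_for(qp: int, k: Dict[str, float] = K_CONSTANTS) -> Dict[str, float]:
    """Scale each feature's constant K by the quantization step size QP."""
    return {name: k_value * qp for name, k_value in k.items()}


# Thresholds can be varied between frames or blocks by supplying different K values.
default = thresholds_for(qp=10)
stricter = thresholds_for(qp=10, k={**K_CONSTANTS, "SAE": 15.0})

print(default["SAE"], stricter["SAE"])
```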

Thus, the embodiment 200b of FIG. 7 illustrates the use
of a plurality of features to determine whether a block is a
"zero" quantized block. The block is classified as a "zero"
quantized block only if the values of all the features are
less than or equal to the respective thresholds for those
features. Otherwise, the DCT, quantization, zig-zag scan and
variable length coding steps are performed for that
particular block B. This embodiment has the benefit of
minimizing the potential for misclassifying blocks because
it requires that all the features be examined, but it has
the drawback of increasing the computations at the video
encoder 12a.

FIG. 8 illustrates yet another embodiment of the zero 
block locater engine 200c according to the present inven- 
tion. The zero block locater engine 200c includes a plurality 
of feature engines, including for example, an SAE engine 
300, an MSE engine 305, a MAD engine 310, a VAR engine 
315 and a MAX engine 320. A selector engine 325 receives 
the incoming block B from the QP engine 50a and selects 
one feature which is to be used in determining whether the 
block B is a "zero" quantized block. The selection of the 
specific feature is determined by measuring some local 
statistics based on a frame or a group of blocks. The feature 
selected can vary for different frames 10, or even for a group 
of blocks within a frame. The selector engine 325 then 
enables the engine for the selected feature, and the selected 
engine computes the value of the feature for that block B. 
The value of the selected feature is then provided to a 
corresponding comparator 330, 335, 340, 345 or 350, which 
determines whether the value of the selected feature is above 
the predetermined threshold for that feature. If the value of 
the selected feature is greater than that feature's threshold, 
then the comparator provides an output classifying the block 
B as a "one" block and the values of the block B are 
provided to the DCT engine 60a for further processing as 
illustrated in connection with FIGS. 3 and 5. If the value of 
the selected feature is less than or equal to that feature's 
threshold, then the block B is classified as a "zero" block and 
the comparator transmits a variable length code output 
indicating that the block B is a "zero" quantized block. The 
thresholds used in the zero block locater engine 200c can be 
the same as those set forth above, and can be varied in the 
manner described above. 
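
A non-limiting software sketch of the selector arrangement of FIG. 8 is given below. The description does not specify which local statistics drive the selection, so the rule in `select_feature` is entirely hypothetical, as are the function names, the thresholds and the sample blocks. Only two of the five feature engines are shown, with SAE assumed to be the sum of absolute values and MAX the largest absolute value.

```python
from typing import Callable, Dict, List, Sequence

Block = Sequence[Sequence[int]]

# Feature engines; only the engine chosen by the selector is evaluated for a block.
FEATURE_ENGINES: Dict[str, Callable[[Block], float]] = {
    "SAE": lambda b: float(sum(abs(v) for row in b for v in row)),
    "MAX": lambda b: float(max(abs(v) for row in b for v in row)),
}


def select_feature(group: List[Block], peak_limit: float = 32.0) -> str:
    """HYPOTHETICAL selection rule: use MAX when any block in the group has a
    peak magnitude of at least peak_limit, otherwise use SAE."""
    peak = max(FEATURE_ENGINES["MAX"](b) for b in group)
    return "MAX" if peak >= peak_limit else "SAE"


def classify_selected(block: Block, feature: str, thresholds: Dict[str, float]) -> str:
    """Compute only the selected feature and compare it with its threshold."""
    value = FEATURE_ENGINES[feature](block)
    return "one" if value > thresholds[feature] else "zero"


# Hypothetical thresholds and 8x8 blocks, for illustration only.
thresholds = {"SAE": 200.0, "MAX": 9.0}
flat = [[1, -1, 0, 0, 1, 0, -1, 0] for _ in range(8)]
spike = [[0] * 8 for _ in range(8)]
spike[3][4] = 40

feature = select_feature([flat, spike])
print(feature, [classify_selected(b, feature, thresholds) for b in (flat, spike)])
```

The selection is made once per group of blocks, so the per-block cost is that of a single feature engine and a single comparison.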

Although five specific features have been described and 
illustrated hereinabove, other features can also be used in 
addition to, or in lieu of, the five features described above.

EXAMPLE 1

The zero block locater engine 200b of FIG. 7 was
modified so that only two features, the SAE and MAX
features, are used. The joint use of SAE and MAX is
believed to be desirable because SAE and MAX are the least
computationally intensive features, SAE has exhibited the
best performance, and MAX is a good indicator of the
presence of any edges. In this example, the threshold for
SAE was selected to be 20*QP, while the threshold for MAX
was selected to be K*QP, with K equal to 15/16.

The effect of using MAX and SAE jointly to classify input
blocks is illustrated in FIG. 9. In FIG. 9, the SAE and MAX
values are plotted for all the blocks while coding the
sequence earphone (a standard test sequence used for testing
video coder-decoders) at 20 Kbps. FIG. 9 shows that SAE and
MAX are highly correlated to each other. In fact, the actual
correlation between the two feature vectors (Corr(SAE,
MAX)) is 0.8445. This implies that use of the two features
will result in the same decision most of the time.

Next, the result of using MAX and SAE jointly for
classification is investigated. FIG. 10 shows that adding
MAX as an additional feature to SAE helps little in reducing
the potential for erroneously misclassifying a block. FIG. 10
shows the distribution of "zero" blocks and "one" blocks in
an SAE-MAX two-dimensional feature space, and in
particular, shows that the two distributions heavily overlap
each other. Thus, using SAE alone and using SAE and MAX
jointly will produce almost the same performance.

EXAMPLE 2

The zero block locater engine 200a of FIG. 6, utilizing an
SAE engine 210, was used to determine the actual amount
of computational savings obtained while coding the
sequence earphone at 20 Kbps. An SAE threshold of 20*QP
was used. It was noted that more than sixty percent of the
blocks were classified as "zero" quantized blocks, for which
the DCT, quantization, zig-zag scan, variable length coding,
IDCT and inverse quantization steps can be omitted.

The computations required by the present invention in
calculating the features do not significantly increase the
overall computation load of the video encoder 12a,
especially if SAE and MAX are used as the features. First, SAE
and MAX are not computationally intensive. Calculating
MAX involves 64 comparisons per block (i.e., for a block
having a size of 64 pels), while SAE computation requires
64 additions per block. Second, it should be noted that, in
any motion compensated video coding, during the motion
estimation step 30, SAE is calculated for a 16x16 block. In
addition, in the H.263 standard, SAE must be calculated for
an 8x8 block if the advanced prediction mode is used. Thus,
using SAE as a selected feature would require no additional
computations.

Although the present invention has been described and
illustrated as being implemented in hardware by employing
engines or circuits, it is also possible to implement the
present invention in software.

Although certain engines, circuits, components,
subsystems, and blocks have been described above as
including certain elements, it will be appreciated by those
skilled in the art that such disclosures are non-limiting, and
that different elements, or combinations thereof, can be
provided for such engines, circuits, components,
subsystems, and blocks without departing from the spirit and
scope of the present invention.

It will be recognized that the above described invention
may be embodied in other specific forms without departing
from the spirit or essential characteristics of the disclosure.
Thus, it is understood that the invention is not to be limited
by the foregoing illustrative details, but rather is to be
defined by the appended claims.

What is claimed is:

1. A method of encoding video data, comprising the steps
of:

performing motion estimation on a sequence of frames of
video data, each frame comprised of a plurality of
blocks, each block containing values representative of
the video data to be encoded;

classifying each frame as either an I frame, a P frame, or
a B frame;

determining a quantization size number for the block or a
group of the plurality of blocks;

determining whether a given block in the frame is a zero
quantized block using an optimal threshold value; and

performing DCT, quantization, zig-zag scan and variable
length coding on the given block if the given block is
not a zero quantized block;

wherein the optimal threshold value is determined experi-
mentally using a plurality of experimental blocks and a
plurality of experimental threshold values, and wherein
an experimental result of encoding the plurality of
experimental blocks using each of the plurality of
experimental threshold values is compared with an
actual result of encoding the plurality of blocks without
the plurality of experimental threshold values.

2. The method of claim 1, wherein said step of determin-
ing whether the given block in the frame is a zero quantized
block further includes the following steps:

determining the mean absolute error of all the values in
the given block; and

comparing the mean absolute error with a predetermined
threshold.

3. The method of claim 1, wherein said step of determin-
ing whether the given block in the frame is a zero quantized
block further includes the following steps:

determining the mean square error of all the values in the
given block; and

comparing the mean square error with a predetermined
threshold.

4. The method of claim 1, wherein said step of determin-
ing whether the given block in the frame is a zero quantized
block further includes the following steps:

determining the block variance of all the values in the
given block; and

comparing the block variance with a predetermined
threshold.

5. The method of claim 1, wherein said step of determin-
ing whether the given block in the frame is a zero quantized
block further includes the following steps:

determining the mean absolute difference of all the values
in the given block; and

comparing the mean absolute difference with a predeter-
mined threshold.

6. The method of claim 1, wherein said step of determin-
ing whether the given block in the frame is a zero quantized
block further includes the following steps:

determining the maximum of all the values in the given
block; and

comparing the maximum with a predetermined threshold.

7. The method of claim 1, wherein said step of determin-
ing whether the given block in the frame is a zero quantized
block further includes the following steps:

determining the mean absolute error of all the values in
the given block;

comparing the mean absolute error with a first predeter-
mined threshold;

determining the mean square error of all the values in the 
given block; 

comparing the mean square error with a second predeter- 
mined threshold; 

determining the block variance of all the values in the 
given block; 

comparing the block variance with a third predetermined 
threshold; 

determining the mean absolute difference of all the values
in the given block;

comparing the mean absolute difference with a fourth
predetermined threshold;

determining the maximum of all the values in the given
block;

comparing the maximum with a fifth predetermined 
threshold; and 

classifying the given block as a zero-quantized block if 
the mean absolute value, the mean square error, the 
block variance, the mean absolute difference and the 
maximum are all less than or equal to each of their 
respective thresholds. 

8. The method of claim 1, wherein said step of determin- 
ing whether the given block in the frame is a zero quantized 
block further includes the following steps: 

selecting a desired feature to be used in determining 
whether the given block is a zero quantized block; 

computing the value of the selected feature for the given 
block; and 

comparing the value of the selected feature for the given 
block with a threshold. 

9. The method of claim 8, wherein the desired feature can 
be selected from a group consisting of: 

the mean absolute value of the given block;

the mean square error of the given block;

the block variance;

the mean absolute difference of the given block; and

the maximum value in the given block.

10. The method of claim 1, wherein said step of determining
whether the given block in the frame is a zero quantized
block further includes the following steps:

calculating a value for a first feature of all the values in
the given block;

comparing the value of the first feature with a first
predetermined threshold;

calculating a value for a second feature of all the values
in the given block;

comparing the value of the second feature with a
second predetermined threshold; and

classifying the given block as a zero-quantized block if
the values of the first and second features are all less
than or equal to each of the first and second
thresholds, respectively.

11. The method of claim 1, wherein said step of deter- 
mining whether the given block in the frame is a zero 
quantized block further includes the step of: 

generating a first output indicative of the detection of a 
zero quantized block. 

12. The method of claim 11, further including the step of:

generating a bitstream output representing the coded
values of the given block.

13. The method of claim 1, wherein said step of deter- 
mining whether the given block in the frame is a zero 
quantized block further includes the following steps:

computing a value for a feature for the given block;

comparing the value of the feature for the given block
with a first threshold;

changing the first threshold to a second threshold;

computing a value for the feature for the next given block;
and

comparing the value of the feature for the next given
block with the second threshold.

14. A video encoder for processing a sequence of frames 
of video data, each frame comprising a plurality of blocks, 
with each block containing values representative of the 
video data to be encoded, comprising: 

a motion estimation engine having an input for receiving
the sequence of frames, and an output;

a frame type decision engine having an input coupled to
the output of the motion estimation engine, and an
output;

a QP decision engine having an input coupled to the 
output of the frame type decision engine, and an output; 

a zero quantized block locator having an input coupled to 
the output of the QP engine, the zero quantized block 
locator operating to determine whether a given block
is a zero quantized block using an optimal threshold 
value determined experimentally by using a plurality of 
experimental blocks and a plurality of experimental 
threshold values; the zero quantized block locator pro- 
ducing a first output indicative of the detection of a zero 
quantized block, and a second output representative of 
the given block; 

a DCT engine having an input coupled to the second 
output of the zero quantized block locator for receiving 
the values of the given block, the DCT engine further 
including an output; 

a quantization engine having an input coupled to the 
output of the DCT engine, and an output; and 

a variable length coding engine having an input coupled 
to the output of the quantization engine, and a bit 
stream output representing the coded values of the 
given block; 

wherein an experimental result of encoding the plurality 
of experimental blocks using each of the plurality of 
experimental threshold values is compared with an 
actual result of encoding the plurality of blocks without 
using the plurality of experimental threshold values. 

15. The video encoder of claim 14, wherein the zero 
quantized block locater further includes: 

an engine for computing the mean absolute error of all the
values in the given block; and

a comparator for comparing the mean absolute error with
a predetermined threshold.

16. The video encoder of claim 14, wherein the zero 
quantized block locater further includes: 

an engine for computing the mean square error of all the
values in the given block; and

a comparator for comparing the mean square error with a
predetermined threshold.

17. The video encoder of claim 14, wherein the zero 
quantized block locater further includes: 

an engine for computing the block variance of all the
values in the given block; and

a comparator for comparing the block variance with a
predetermined threshold.

18. The video encoder of claim 14, wherein the zero 
quantized block locater further includes: 

an engine for computing the mean absolute difference of
all the values in the given block; and

a comparator for comparing the mean absolute difference
with a predetermined threshold.

19. The video encoder of claim 14, wherein the zero 
quantized block locater further includes:

an engine for determining the maximum of all the values 
in the given block; and 

a comparator for comparing the maximum with a prede- 
termined threshold. 

20. The video encoder of claim 14, wherein the zero 
quantized block locater further includes: 

a first engine for determining the mean absolute error of 
all the values in the given block; 

a first comparator for comparing the mean absolute error 
with a first predetermined threshold; 

a second engine for determining the mean square error of 
all the values in the given block; 

a second comparator for comparing the mean square error 
with a second predetermined threshold; 

a third engine for determining the block variance of all the 
values in the given block; 

a third comparator for comparing the block variance with 
a third predetermined threshold; 

a fourth engine for determining the mean absolute differ- 
ence of all the values in the given block; 

a fourth comparator for comparing the mean absolute 
difference with a fourth predetermined threshold; 

a fifth engine for determining the maximum of all the 
values in the given block; 

a fifth comparator for comparing the maximum with a 
fifth predetermined threshold; and 

means coupled to the first, second, third, fourth and fifth 
comparators for determining whether the mean abso- 
lute error, the mean square error, the block variance, the 
mean absolute difference and the maximum are all less 
than or equal to each of their respective thresholds. 

21. The video encoder of claim 14, wherein the zero
quantized block locater further includes:

a selector which selects a desired feature to be used in
determining whether the given block is a zero quantized
block;

a first engine for computing the value of a first feature
for the given block;

a first comparator for comparing the value of the first
feature for the given block with a first threshold;

a second engine for computing the value of a second
feature for the given block;

a second comparator for comparing the value of the
second feature for the given block with a second
threshold;

wherein the selector is coupled to the first and second
engines for selectively enabling one of the first
engine or the second engine.

22. The video encoder of claim 21, wherein the first and 
second features can be selected from a group consisting of: 

the mean absolute value of the given block;

the mean square error of the given block;

the block variance;

the mean absolute difference of the given block; and

the maximum value in the given block.

23. The video encoder of claim 14, wherein the zero 
quantized block locater further includes: 

a first engine for calculating a first feature of all the values
in the given block;

a first comparator for comparing the value of the first
feature with a first predetermined threshold;

a second engine for calculating a second feature of all the
values in the given block;

a second comparator for comparing the value of the
second feature with a second predetermined threshold;

means for classifying the given block as a zero-quantized
block if the values of the first and second features are
all less than or equal to each of the first and second
thresholds, respectively.

24. A method of encoding video data, comprising the 
steps of: 

performing motion estimation on a sequence of frames, 
each frame comprising a plurality of blocks; 

classifying each frame as either an I frame, a P frame, or 
a B frame; 

determining the quantization step size for each block or
group of blocks;

classifying each block as a zero quantized block or a
non-zero quantized block using an optimal threshold
value;

performing DCT, quantization, zig-zag scan and variable 
length coding on the non-zero quantized blocks; and 

skipping the DCT, quantization, zig-zag scan and variable 
length coding steps for the zero quantized blocks; 

wherein the optimal threshold value is determined experi- 
mentally using a plurality of experimental blocks and a 
plurality of experimental threshold values, and wherein 
an experimental result of encoding the plurality of 
experimental blocks using each of the plurality of 
experimental threshold values is compared with an 
actual result of encoding the plurality of blocks without 
using the plurality of experimental threshold values. 

25. A method of encoding video data, the method com- 
prising the steps of: 

performing motion estimation on a sequence of frames, 
each frame comprising a plurality of blocks; 

determining the quantization step size for each block of 
the plurality of blocks; 

selecting a feature of each block of the plurality of blocks; 

selecting a constant value associated with the feature; 

generating a threshold value by multiplying the quanti- 
zation step size by the constant value; 

classifying each block as a zero quantized block or a 
non-zero quantized block based on a comparison of the 
feature with the threshold value; 

performing DCT, quantization, zig-zag scan and variable 
length coding on the non-zero quantized blocks; and 

skipping the DCT, quantization, zig-zag scan and variable 
length coding steps for the zero quantized blocks; 

wherein the constant value is determined experimentally 
using a plurality of experimental blocks and a plurality 
of experimental constant values, and wherein an 
experimental result of encoding the plurality of experi- 
mental blocks using each of the plurality of experimen- 
tal threshold values is compared with an actual result of 
encoding the plurality of blocks without using the 
plurality of experimental threshold values. 

26. The method of claim 25, wherein the feature is one of 
mean absolute value of the block, mean square error of the 
block, variance of the block, mean absolute difference of the 
block or maximum value in the block.